Transcription factor binding site detection using convolutional neural networks with a functional group-based data representation
Abstract
Transcription factors (TFs) play an essential role in molecular biology by regulating gene expression. The binding sites of TFs vary considerably, and the numerous possible binding locations make their detection a challenging issue. Recently, several machine learning approaches using nucleotide sequence data were applied to classify DNA sequences regarding Transcription Factor Binding Sites (TFBS). We propose a novel training strategy without the traditional 1D nucleotide-based data representation, using instead a 2D topological matrix representation of the sub-nucleotide chemical functional groups that substantially define the protein binding ability of DNA fragments. We train convolutional neural networks using this Functional Group-based Data Representation (FGDR) to solve the TFBS classification task. We compare our results with the efficiency of previous approaches and show that learning from FGDR has benefits regarding TFBS classification. Moreover, we reason that deep learning from FGDR produces competitive results while only introducing a pre-processing data conversion step. Finally, employing ensemble models using the data representations for training the network produces higher performance than any of the single-input approaches.
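The contrast between the two input representations can be illustrated with a minimal sketch. The abstract does not specify the actual FGDR layout, so the 2D encoding below is an assumption made for illustration only: it uses the commonly taught major-groove pattern of functional groups per base pair (acceptor, donor, methyl, nonpolar hydrogen). The names `one_hot_1d`, `functional_group_2d`, `GROOVE_PATTERN`, and `GROUP_TYPES` are hypothetical and do not come from the paper.

```python
import numpy as np

BASES = "ACGT"

# ASSUMPTION (not the paper's FGDR): textbook major-groove pattern of
# functional groups for the base pair formed by each nucleotide.
# A = hydrogen-bond acceptor, D = hydrogen-bond donor,
# M = methyl group, H = nonpolar hydrogen.
GROOVE_PATTERN = {"A": "ADAM", "T": "MADA", "G": "AADH", "C": "HDAA"}
GROUP_TYPES = "ADMH"


def one_hot_1d(seq):
    """Traditional nucleotide-level encoding: a 4 x L one-hot matrix."""
    seq = seq.upper()
    out = np.zeros((len(BASES), len(seq)), dtype=np.float32)
    for j, base in enumerate(seq):
        out[BASES.index(base), j] = 1.0
    return out


def functional_group_2d(seq):
    """Illustrative sub-nucleotide encoding: a 16 x L matrix.

    Each column holds four groove positions, and each position is
    one-hot encoded over the four functional group types.
    """
    seq = seq.upper()
    n_types = len(GROUP_TYPES)
    out = np.zeros((4 * n_types, len(seq)), dtype=np.float32)
    for j, base in enumerate(seq):
        for pos, group in enumerate(GROOVE_PATTERN[base]):
            out[pos * n_types + GROUP_TYPES.index(group), j] = 1.0
    return out


if __name__ == "__main__":
    fragment = "ACGTTGCA"
    print(one_hot_1d(fragment).shape)
    print(functional_group_2d(fragment).shape)
```

Either matrix could then be fed to a convolutional network as a single-channel image. The sketch does not handle ambiguous bases such as `N`; a real pre-processing step would need a policy for them.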
Similar resources
Convolutional Neural Networks using Logarithmic Data Representation
Recent advances in convolutional neural networks have considered model complexity and hardware efficiency to enable deployment onto embedded systems and mobile devices. For example, it is now well-known that the arithmetic operations of deep networks can be encoded down to 8-bit fixed-point without significant deterioration in performance. However, further reduction in precision down to as low ...
DeepSite: protein-binding site predictor using 3D-convolutional neural networks
Motivation: An important step in structure-based drug design consists in the prediction of druggable binding sites. Several algorithms for detecting binding cavities, those likely to bind to a small drug compound, have been developed over the years by clever exploitation of geometric, chemical and evolutionary features of the protein. Results: Here we present a novel knowledge-based approach th...
Convolutional Kitchen Sinks for Transcription Factor Binding Site Prediction
We present a simple and efficient method for prediction of transcription factor binding sites from DNA sequence. Our method computes a random approximation of a convolutional kernel feature map from DNA sequence and then learns a linear model from the approximated feature map. Our method outperforms state-of-the-art deep learning methods on five out of six test datasets from the ENCODE consortiu...
Feature Based Representation and Detection of Transcription Factor Binding Sites
The prediction of transcription factor binding sites is an important problem, since it reveals information about the transcriptional regulation of genes. A commonly used representation of these sites is the position-specific weight matrix, which shows weak predictive power. We introduce a feature-based modelling approach, which is able to deal with various kinds of biological properties of binding ...
Discovery of Transcription Factor Binding Sites with Deep Convolutional Neural Networks
Transcription factors are key gene regulators, responsible for modulating the conversion of genetic information from DNA to RNA. Though these factors can be discovered experimentally, computational biologists have become increasingly interested in learning transcription factor binding sites from sequence data computationally. Though traditional machine learning architectures, including support ...
Journal
Journal title: Journal of Physics: Conference Series
Year: 2021
ISSN: 1742-6588, 1742-6596
DOI: https://doi.org/10.1088/1742-6596/1824/1/012001